Abstract

The problem of combining probabilities occurs principally in combining
one-sided independent tests, and in testing the simple hypothesis
of  goodness-of-fit.  The  mechanics  of  using  the  logit  statistic,  which 
is  the  sum  of  the  logits  of  the  probabilities,  in  the  above  two  and  other 
applications  is  described.  Several  approaches  to  studying  the  classical 
combination  statistics  are  discussed  and  the  logit  statistic  is  reviewed 
1. INTRODUCTION AND SUMMARY. A scientific inquiry into any major
problem  in  general  consists  of  several  independent  investigations  separated 
in  time  and  space  and  differing  in  quantitative  and  possibly  qualitative 
aspects  of  design.  Very  often  the  results  of  the  individual  investigations 
are  inconclusive,  and  it  becomes  necessary  to  pool  the  diverse  pieces  of 
evidence.  If  the  problem  concerns  the  truth  or  falsity  of  a scientific 
hypothesis, then in different investigations it may be formulated as different
statistical hypotheses, and appropriate tests of significance are applied.
The  aggregate  of  these  tests,  possibly  of  marginal  significance  individually, 
can  lead  to  scientifically  decisive  conclusions  if  their  results  are  viewed 
as a whole. In scientific reporting the most common device used for summarizing
results of tests of significance is their P-values. The P-values,
also  known  as  the  significance  probabilities,  are  simple  to  interpret 
marginally  and  the  search  for  a suitable  combination  statistic,  i.e.,  a 
compound  of  the  P-values,  on  which  to  predicate  an  objective  judgment 
about  the  basic  problem  is  the  subject  of  the  theory  of  combining  tests. 

Even  though  many  of  the  classical  combination  statistics  were 
introduced and are most commonly used for the purpose of combining tests
and  the  greater  bulk  of  literature  on  their  study  has  grown  around  this 
aspect,  the  simple  hypothesis  of  goodness-of-fit  has  been  an  equally  strong 
motivation  behind  them.  Indeed,  at  an  early  stage  of  their  development  E.S. 
Pearson  (1938)  explicitly  discussed  the  dual  role  of  combination  statistics, 
and  almost  parallel  streams  of  articles  on  these  two  applications  have 
developed over the past forty-five years. The well known uniform distribution
resulting out of the probability integral transformation is the common
core of the twin applications. Because of it the simple hypothesis of
fit  in  canonical  form  is  the  hypothesis  of  uniformity,  and  the  null 
distribution  of  the  P-values  is  uniform.  The  combination  statistics  have 
also  been  discussed  in  the  solutions  to  several  other  problems,  e.g.  the 
two  sample  problem,  testing  composite  hypotheses  of  fit,  and  several 
testing  of  hypothesis  problems  in  multivariate  analysis.  The  role  of  the 
statistics  in  these  solutions  stems  from  the  two  basic  applications 
mentioned  earlier,  namely  the  combination  of  tests  and  the  goodness-of-fit. 
However,  the  study  of  these  statistics,  which  must  perform  differently 
in  different  solutions  has,  in  the  context  of  these  problems,  just  begun. 

Specifically, let $T_i$, $i = 1,2,\ldots,k$, be k independently distributed
statistics, from the k investigations, with continuous distribution
functions $F_{i,\theta_i}$, for testing the respective null hypotheses
$H_{0i}: \theta_i = \theta_{i0}$ against the respective alternatives
$\theta_i > \theta_{i0}$, $i = 1,2,\ldots,k$. If the large values of $T_i$
are significant, then so are the small values of the P-values
$P_i = 1 - F_{i,\theta_{i0}}(T_i)$. On the one hand, the problem of combining
the tests is to find a suitable combination statistic
$\Psi(P_1,P_2,\ldots,P_k)$ testing the overall null hypothesis

$$H_0 = \bigcap_{i=1}^{k} \{H_{0i}: \theta_i = \theta_{i0}\},$$

the logical conjunction of the $H_{0i}$, against the alternative
$\theta \ge \theta_0$, where $\theta = (\theta_1,\theta_2,\ldots,\theta_k)$,
$\theta_0 = (\theta_{10},\theta_{20},\ldots,\theta_{k0})$,
and $\ge$ denotes the coordinatewise partial order, viz:
$\theta_i \ge \theta_{i0}$ for $i = 1,2,\ldots,k$ with at least one inequality
strict. On the other hand, given a sample $X_1,X_2,\ldots,X_n$ from a
population with distribution function (d.f.) $F$, the simple hypothesis of
goodness-of-fit is $F = F_0$, where $F_0$ is a specified d.f. Under $H_0$,
$Y_i = F_0(X_i)$, $i = 1,2,\ldots,n$, similar to $P_i$ in the earlier case,
are uniformly distributed so that a combination statistic
$\Psi(Y_1,Y_2,\ldots,Y_n)$ is useful for testing goodness-of-fit.

The best known combination statistics include (i) the earliest
proposed $\Psi_T = \min_{1 \le i \le k} P_i$ due to L. H. C. Tippett,
(ii) $\Psi_F = \prod_{i=1}^{k} P_i$ due to R. A. Fisher,
(iii) $\Psi_P = \prod_{i=1}^{k} (1-P_i)$ due to K. Pearson, and relatively new
(iv) $\Psi_N = \sum_{i=1}^{k} \Phi^{-1}(1-P_i)$, where $\Phi(\cdot)$ is the
standard normal d.f., due to Liptak. These are easy to compute and all have
simple null distributions: $-2\log\Psi_F$ and $-2\log\Psi_P$ are
$\chi^2_{2k}$ variables, $\Psi_N$ is normally distributed,
and $\Psi_T$ is distributed as the smallest uniform order statistic. Another
statistic of this sort, termed the logit statistic, is considered by
George (1977) and George and Mudholkar (1977b). It is the focus of the
present paper.
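The null distributions just listed translate directly into P-values for the overall hypothesis. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, only the Python standard library is used, and the chi-square tail is computed from the closed form that holds for even degrees of freedom.

```python
import math
from statistics import NormalDist

def chi2_sf_even(x, k):
    """P(chi-square with 2k d.f. > x), closed form valid for even d.f."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j          # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def tippett(ps):
    # min P_i is the smallest of k uniforms: P(min <= m) = 1 - (1 - m)^k
    return 1.0 - (1.0 - min(ps)) ** len(ps)

def fisher(ps):
    # -2 log(prod P_i) is chi-square(2k); large values are significant
    stat = -2.0 * sum(math.log(p) for p in ps)
    return chi2_sf_even(stat, len(ps))

def pearson(ps):
    # -2 log(prod (1 - P_i)) is chi-square(2k); small values are significant
    stat = -2.0 * sum(math.log(1.0 - p) for p in ps)
    return 1.0 - chi2_sf_even(stat, len(ps))

def liptak(ps):
    # sum of Phi^{-1}(1 - P_i) is N(0, k); large values are significant
    nd = NormalDist()
    stat = sum(nd.inv_cdf(1.0 - p) for p in ps)
    return 1.0 - nd.cdf(stat / math.sqrt(len(ps)))
```

With a single P-value (k = 1) each of the four functions should simply return that P-value, which is a convenient sanity check on the tail directions assumed above.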




The literature on the theory and methodological and practical applications
of the combination statistics is substantial. The theory of
combining tests is well summarized in George (1977), and the monograph
by Oosterhoff (1969). The applications to the goodness-of-fit problem
are described and reasonably referenced in Chapman (1958), Lin (1977),
and George (1977). The connection to the two-sample problem may be seen
e.g. in Bell, Moser and Thompson (1966); and to the multivariate testing
problems in Mudholkar and Subbaiah (1977). This essay reviews recent
progress in the study of the logit combination statistics in various
applications.

The logit statistic and its null distribution are described in Section
2. In Section 3, the principal approaches to studying the methods for
combining independent one-sided tests are surveyed and the logit statistic
is reviewed in the light of some of these studies. This section also contains
brief illustrative summaries of two simulation experiments conducted
with a view to comparing the logit and the classical combination statistics
for combining independent and quasi-independent (i.e. independent only
under $H_0$) tests. The logit statistic is considered in the context of
the goodness-of-fit problem in Section 4. In the same section we present
some empirical results about the power functions of the variations of a
test, for the composite hypothesis of exponentiality, obtained by using
Fisher's, Pearson's and the logit statistics. Finally, Section 5 is
devoted to miscellaneous remarks including one on the weighted logit
statistic.

2. THE LOGIT STATISTIC. Let $P_1,P_2,\ldots,P_k$ be k independently
distributed P-values, or their analogues in the other applications described
in Section 1. Following Berkson (1944) the sum of the log-odds ratios,
i.e. the logits, of $P_i$

$$\Psi_L(P_1,P_2,\ldots,P_k) = \sum_{i=1}^{k} \log[P_i/(1-P_i)], \qquad (2.1)$$

is termed the logit statistic. Under the null hypotheses the $P_i$ are
uniform (0,1), the $\log[P_i/(1-P_i)]$ are distributed according to the
logistic law with the d.f.

$$F(z) = [1 + \exp(-z)]^{-1}, \qquad (2.2)$$

and consequently $\Psi_L$ is distributed according to the k-fold convolution
$F_k(\cdot)$ of $F(\cdot)$. It is easy to verify that

$$1 - F_2(z) = z\,[e^{-z}/(1-e^{-z})]^2 + (z-1)\,[e^{-z}/(1-e^{-z})] \qquad (2.3)$$

and

$$1 - F_3(z) = \frac{e^{-z}}{1+e^{-z}} - \frac{2z\,e^{-z}}{(1+e^{-z})^2}
 + \frac{(z^2+\pi^2)\,e^{-z}(1-e^{-z})}{2(1+e^{-z})^3}. \qquad (2.4)$$

FIGURE 1. The Distribution Functions* (Right Scale) of the Standardized Logit
Statistic, and the Error (Left Scale) in its Approximation by (2.5).

*In the scale the distribution functions, symmetrical about 0, are
indistinguishable for k = 2 and k = 3.

George and Mudholkar (1977), by inverting the Mittag-Leffler expansion
of the characteristic function of $\Psi_L$, obtain closed form expressions
for $F_k$ for general k. However, for practical purposes, the null
distribution of the logit statistic admits a simple and very accurate
approximation in terms of a Student's t-distribution. This approximation
is  suggested  by  the  fact  that  the  logistic  distribution  is  close  enough 
in  shape  to  the  normal  distribution  to  have  emerged  as  a substitute  for 
it in applications such as bioassay. In the case of $\Psi_L$ this similarity is
enhanced because of the central limit effect. Yet both the logistic
distribution and its convolution are heavier tailed than the normal
distribution, and may be better approximated by another heavy tailed
distribution, namely an appropriate multiple of a Student's t-distribution.
Specifically, letting $t_\nu$ be the Student's t-variable with $\nu$
degrees of freedom, it is proposed that

$$\Psi_L \approx \pi\left[\frac{k(5k+2)}{3(5k+4)}\right]^{1/2} t_{5k+4}, \qquad (2.5)$$

where $\approx$ denotes the equivalence in law, and where the degrees of freedom
(5k+4) and the scaling constant $\pi[k(5k+2)/(3(5k+4))]^{1/2}$ are obtained by
equating the variances ($k\pi^2/3$ and $\nu/(\nu-2)$), and the coefficients of
kurtosis ($3 + 1.2/k$ and $3 + 6/(\nu-4)$), of the logit statistic and $t_\nu$,
respectively. Equating the kurtoses gives $\nu = 5k+4$, and the variance
condition then fixes the scaling constant.
The quality of the approximation (2.5) for k = 2 and k = 3 may be seen in
Figure 1, which shows the difference between the d.f. $F_k(z(k\pi^2/3)^{1/2})$
of the standardized logit statistic $\Psi_L/(k\pi^2/3)^{1/2}$ and the
approximation for it. It also displays the d.f.'s for k = 2, 3. Even for k = 2 the
approximation is reasonable, especially in the tails.
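The t-approximation (2.5) is simple enough to try out numerically. The sketch below is illustrative only: the function names are hypothetical, the Student's t tail is obtained by Simpson's rule because the Python standard library has no t-distribution, and the exact k = 2 tail is taken to be the closed form (2.3) as read here. Small values of the logit sum are treated as significant, matching small P-values.

```python
import math

def logit_statistic(ps):
    """Sum of the logits log[P_i / (1 - P_i)], as in (2.1)."""
    return sum(math.log(p / (1.0 - p)) for p in ps)

def t_pdf(x, nu):
    log_c = (math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2.0 * math.log1p(x * x / nu))

def t_sf(x, nu, steps=2000):
    """P(t_nu > x) via Simpson's rule on [0, |x|]; steps must be even."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    upper = 0.5 - total * h / 3.0
    return upper if x >= 0 else 1.0 - upper

def approx_sf(z, k):
    """Approximate P(Psi_L > z) under the null, using (2.5)."""
    nu = 5 * k + 4
    scale = math.pi * math.sqrt(k * (5 * k + 2) / (3.0 * (5 * k + 4)))
    return t_sf(z / scale, nu)

def logit_pvalue(ps):
    # Small logit sums are significant; by symmetry P(L <= l) = P(L >= -l).
    return approx_sf(-logit_statistic(ps), len(ps))

def exact_sf_k2(z):
    """Exact 1 - F_2(z) from (2.3), for z != 0."""
    w = math.exp(-z) / (1.0 - math.exp(-z))
    return z * w * w + (z - 1.0) * w
```

For example, comparing `approx_sf(z, 2)` with `exact_sf_k2(z)` over a grid of z values gives a quick, informal picture of the error curve that Figure 1 reports for k = 2.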

3. THE COMBINATION OF TESTS. The importance of the classical combination
statistics such as Tippett's $\Psi_T$, Fisher's $\Psi_F$ and Pearson's $\Psi_P$ lay,
at least when they were introduced, in the simplicity of their null
distributions. Since then these have been variously adapted to discrete
distributions  (e.g.  Lancaster  (1949),  E.  S.  Pearson  (1950),  Wallis  (1944)) 
and  generalized  (e.g.  Good  (1955))  with  some  loss  of  the  simplicity. 

The  earliest  systematic  view  of  the  problem  of  combining  one-sided 
independent tests described in Section 1 is given by Birnbaum (1954) when
he initiated the study of the admissibility properties of the combination
methods. Among the workers who have since contributed to the development
of the theory of combining tests are Liptak (1958), Lancaster (1961),
Schaafsma (1968), van Zwet and Oosterhoff (1967), Oosterhoff (1969),
Littell and Folks (1971, 1973), Brown, Cohen and Strawderman (1976) and
George  (1977).  Also  noteworthy  are  the  contributions  by  Davies  and 
Puri  (1967),  Davies  (1969)  and  others,  all  stimulated  by  the  needs  of 
Neyman  and  Scott  (1967)  for  combining  rainfall  data. 

Unlike  the  common  one-parameter  problems  in  statistical  theory,  the 
problem  of  combining  one-sided  independent  tests,  a multiparameter  problem, 
does  not  in  general  admit  solutions  which  are  U.M.P.,  U.M.P.  invariant  or 
U.M.P. unbiased. The theoretical investigations of the subject have therefore
centered either on studies of properties such as admissibility, Bayes
character, and most stringent character, on derivation of complete class
theorems for the tests, or on studies of their asymptotic behavior. In
this section we review some properties of the logit statistic $\Psi_L$, when
used for combining tests in the light of these studies and give some
supportive Monte Carlo evidence.

Finite Sample Studies. In order to examine the combination methods
for independent one-sided tests theoretically, the problem may also be
formulated in terms of the distributions of the P-values instead of, as
in the Introduction, in terms of the distributions of the test-statistics
$T_i$. Both Birnbaum (1954) and Liptak (1958) consider the case in which
the only assumption made about the P-values is that the density of each,
uniform under the null hypothesis, is nonincreasing in the parameter under
the alternative, or equivalently that the family of the distributions of
each has monotone likelihood ratio (M.L.R.). For this situation they
show that any monotone level $\alpha$ combination test is most powerful against
an alternative, where a combination test using a statistic $\Psi(P_1,P_2,\ldots,P_k)$
is said to be monotone if rejection of the overall null hypothesis
$H_0 = \bigcap_{i=1}^{k} H_{0i}$ for a vector $(P_1,P_2,\ldots,P_k)$ of the
P-values implies its rejection for any $(P_1^*,P_2^*,\ldots,P_k^*)$ such that
$P_i^* \le P_i$, $i = 1,2,\ldots,k$.
All the monotone methods are thus admissible in this model. Birnbaum, by
restricting the model by postulating exponential families for the distributions
of $T_i$, shows that for the admissibility of a combination method,
its acceptance region must be convex in the space of $T_i$, $i = 1,2,\ldots,k$.
In this restricted model, which includes tests for normal means when the
variances are known but excludes the combination of the t-tests and the
analysis of variance tests, the logit method is inadmissible. Liptak on
the other hand further examines the subclass of tests based upon statistics
of the form $\sum_i w_i h(P_i)$ in the original model, where $h(\cdot)$ is monotone
and $w_i$ are nonnegative weights. He demonstrates the Bayes character of
such combination tests and shows that they are unbiased if each component
test is unbiased. Clearly the logit method, and its weighted version
introduced later, shares the admissibility (by virtue of being the
most powerful against an alternative), the Bayes character and the
unbiasedness with all classical combination tests.
The combination tests have also been studied by assuming that the
independent investigations are such that the distribution of the P-values
under the alternatives may be assumed to belong to specific parametric
families of distributions, as, for example in Yates (1955). For instance,
if $P_i$ are independent beta-variables with parameters $(a_i,b_i)$, then the
null hypotheses reduce to $a_i = b_i = 1$, $i = 1,2,\ldots,k$, and the
most powerful test against the simple alternative $\{(a_i,b_i),\ i = 1,2,\ldots,k\}$
rejects when $\prod_{i=1}^{k} P_i^{a_i-1} \prod_{i=1}^{k} (1-P_i)^{b_i-1}$ is large.
Thus Fisher's test is U.M.P. against all alternatives satisfying
$\{a_1 = a_2 = \cdots = a_k < 1,\ b_1 = b_2 = \cdots = b_k = 1\}$ and the logit
method is U.M.P. against all alternatives
$\{a_1 = a_2 = \cdots = a_k < 1,\ a_i + b_i = 2,\ i = 1,2,\ldots,k\}$. Lancaster (1961) has
developed a more interesting method of evaluating any combination method
at a specified alternative by using the series representation of the
alternative distribution in terms of the set of functions orthonormal with
respect to the null distribution. But such analyses, although very useful
locally, are not best suited to overall comparisons of the combination tests.
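The reduction of the beta-alternative likelihood ratio to the logit statistic can be spelled out in a line or two. The derivation below is a sketch under the assumptions stated in the text (independent beta P-values, common $a < 1$, and $a_i + b_i = 2$); $B(\cdot,\cdot)$ denotes the beta function and is notation introduced here.

```latex
% Likelihood ratio of the beta alternative against the uniform null,
% with a_i = a < 1 and b_i = 2 - a for every i:
\prod_{i=1}^{k} \frac{P_i^{a-1}(1-P_i)^{(2-a)-1}}{B(a,2-a)}
  = B(a,2-a)^{-k} \prod_{i=1}^{k} \left[\frac{P_i}{1-P_i}\right]^{a-1}
  = B(a,2-a)^{-k} \exp\Bigl\{(a-1)\sum_{i=1}^{k}\log\frac{P_i}{1-P_i}\Bigr\}.
% Since a - 1 < 0, the ratio is a decreasing function of the sum of logits,
% so the most powerful test rejects for small values of that sum whatever
% the value of a < 1; this is why the same test is U.M.P. over the family.
```

Setting $b_i = 1$ instead leaves only $\prod P_i^{a-1}$, which is the corresponding argument for Fisher's statistic.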

There are a number of elegant complete class theorems for the multiparameter
problems which are either obtained in the framework of the problem
of combining (not necessarily independent) tests or are applicable to
it. For example, in terms of the partial order $\ge$ for vectors, defined
in the Introduction, Oosterhoff (1969) defines the joint density $f(t;\theta)$,
$t$, $\theta$ both vectors, as possessing a strict M.L.R. in $t$ if for $\theta \ge \theta^*$,
$[f(t;\theta)/f(t;\theta^*)]$ is strictly increasing in $t$. If $T_i$ are independent
then strict M.L.R. for $f(t;\theta)$ is equivalent to strict M.L.R. for the
density of each $T_i$. For the case of strict M.L.R. if, in addition,
$f(t;\theta) > 0$ for all $t$ and the family is dominated by a nonatomic
measure he shows that the class of monotone tests is essentially complete.
Brown, Cohen and Strawderman (1976) on the other hand prove a complete
class theorem in a more general framework and as an application show that
under the above conditions the class of monotone tests is complete. The
logit method obviously has membership in such complete classes.

Asymptotic Studies. Oosterhoff (1969) in his monograph describes and
uses a number of asymptotic approaches to studying the combination tests.
For example, in one, by appealing to the limiting distributions of the
test statistics, the problem is reduced to that of combining tests for the
means of normal variables with unit variances, and in another the shortcoming
of the tests as their levels tend to zero is used as the criterion.
However, an effective asymptotic scheme, which is not discussed in the
monograph, is Bahadur's (e.g. 1971) method of comparing the exact slopes
adapted to the combination problem by Littell and Folks (1971). This method,
unlike other asymptotic methods and the traditional approaches described
previously, yields measures which describe the operating characteristics of
the major combination tests over broad sets of alternatives, and successfully
narrows the class of the contenders substantially.

If the null hypothesis is false, then for any fixed alternative the
P-value P of any reasonable test converges to zero (exponentially) as
the sample size n tends to infinity. The exact slope $C(\theta)$ at an
alternative $\theta$, which measures the rate of the decline of P with
respect to n, is the a.s. limit of $-(2/n)\log P$, provided that it exists.

The computation of the exact slope, in general a nontrivial task, often
relies upon a well known technique as summarized, for example, in Theorem
7.2 of Bahadur (1971), which requires the large deviation probability
result for the exact null distribution of the test statistic and the a.s.
limit of the test statistic scaled appropriately by a power of n. In
the problem of combining the k tests, suppose that $P_i$, the P-value
of the test of $\theta_i = \theta_{i0}$, is based upon a sample of size $n_i$,
$i = 1,2,\ldots,k$. Let $n = \sum_{i=1}^{k} n_i/k$ and denote the d.f. of a
combination statistic $\Psi = \Psi(P_1,P_2,\ldots,P_k)$, under
$H_0 = \bigcap_{i=1}^{k} H_{0i}$, by $F_n$. Suppose
that (i) $(n_i/n) \to \lambda_i$ as each $n_i \to \infty$,
(ii) $(\Psi/\sqrt{n}) \to b(\theta)$, a.s. $[\theta]$,
where $\theta = (\theta_1,\theta_2,\ldots,\theta_k)$, and finally (iii)
$-(1/n)\log[1 - F_n(\sqrt{n}\,t)] \to f(t)$, as $n \to \infty$, where
$0 < f(t) < \infty$, and $f(t)$
is continuous at least for t in the range of $b(\theta)$. Then the exact
slope $C(\theta)$ of the statistic $\Psi$ at the alternative $\theta$ is given by
$C(\theta) = 2 f(b(\theta))$.

In the analyses of the combination statistics composed of the P-values,
it is assumed that the exact slope of the component test of
$\theta_i = \theta_{i0}$
at an alternative $\theta_i$, which may as well be regarded as a function of
the alternative $\theta = (\theta_1,\theta_2,\ldots,\theta_k)$, is
$C_i(\theta) = \lim(-2/n_i)\log P_i$ a.s.,
$i = 1,2,\ldots,k$. With this assumption Littell and Folks (1971) compute
the exact slopes of several combination statistics including $\Psi_F$, $\Psi_T$,
and $\Psi_N$. In order to compute the exact slope of the logit statistic
$\Psi_L$, George and Mudholkar (1977b) use the expressions for the convolution
of the logistic distributions obtained earlier (1977a) and show that
the large deviation function $f(t)$ for the exact null distribution of
$\Psi_L$ is the identity function, i.e. $f(t) = t$. Furthermore, for $\Psi_L$,
$b(\theta) = \sum_{i=1}^{k} \lambda_i C_i(\theta)/2$. Consequently it is concluded
that the exact slope of the logit combination test is

$$C_L(\theta) = \sum_{i=1}^{k} \lambda_i C_i(\theta), \qquad (3.1)$$

the same as the exact slope of Fisher's $\Psi_F$ obtained earlier. Littell
and  Folks  later  (1973)  prove  that  the  exact  slope  of  any  monotone 
combination  test  is  no  greater  than  the  exact  slope  of  Fisher's  combination 
test.  Clearly,  therefore,  with  respect  to  the  criterion  of  exact  Bahadur 
A.R.E.,  the  logit  combination  method  is  optimal  in  the  class  of  monotone 
combination  methods. 
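The limit $b(\theta)$ quoted for the logit statistic follows in a few lines from the assumed component slopes. The sketch below takes two liberties that should be read as assumptions rather than as the authors' derivation: the statistic is negated so that large values are significant, and it is scaled by $n$, one of the "appropriate powers of n" allowed for above.

```latex
% Component slopes: -(2/n_i) log P_i -> C_i(theta) a.s., and n_i/n -> lambda_i.
-\frac{1}{n}\log P_i
   = \frac{n_i}{n}\Bigl(-\frac{1}{n_i}\log P_i\Bigr)
   \;\longrightarrow\; \frac{\lambda_i C_i(\theta)}{2} \quad \text{a.s.}
% Under a fixed alternative P_i -> 0, hence log(1 - P_i) -> 0 and
-\frac{1}{n}\sum_{i=1}^{k}\log\frac{P_i}{1-P_i}
   \;\longrightarrow\; \sum_{i=1}^{k}\frac{\lambda_i C_i(\theta)}{2} = b(\theta).
% With the large deviation function f(t) = t this gives
C_L(\theta) = 2 f(b(\theta)) = \sum_{i=1}^{k}\lambda_i C_i(\theta).
```

The $\log(1-P_i)$ terms, which distinguish the logit statistic from Fisher's, vanish in the limit; this is the informal reason the two exact slopes coincide.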

It  is  to  be  emphasized  that  for  obtaining  the  exact  slope  as  described 
above,  the  distribution  of  the  combination  statistic  is  needed  only  under 
the  overall  null  hypothesis;  under  the  alternative  only  the  (a.s.) 
limiting  value  of  the  scaled  statistic  is  needed.  As  a consequence,  the 
method  of  analysis  can  be  readily  applied  to  the  problem  of  combining 
tests  based  on  statistics  which  are  independent  under  the  null,  but  not 
necessarily  under  the  alternative  hypothesis.  Using  this  approach, 

Mudholkar  and  Subbaiah  (1977)  show  that  if  the  components  of  Rao's  (1972) 
test  for  additional  information  are  combined  using  Fisher's  method  then 
the  resulting  test  is  equivalent,  in  terms  of  the  exact  slopes,  to  the 
$T^2$-test based on all variables. More generally, it is shown that the same
equivalence holds between Hotelling's $T^2$-test for the problem of testing
the  significance  of  the  mean  of  a multivariate  normal  population,  and 
the  Fisher  combination  of  the  t-tests  in  J.  Roy's  stepdown  procedure 
adapted  and  investigated  by  them  earlier  (1975,  1976)  for  this  problem. 

Furthermore,  in  this  context,  the  asymptotic  behavior  of  the  logit 
combination,  in  this  sense,  can  be  shown  to  be  identical  to  the  Fisher 
combination. 

Two Simulation Studies. Two Monte Carlo experiments, which are in
progress and are expected to give some indication of the efficacy of the
logit method for combining tests as compared with a few classical methods,
are now summarized. In the first experiment k independent t-tests based
on samples of size n, for the significance of the means of k normal
populations against one-sided alternatives, are combined. In the second
experiment Hotelling's $T^2$-test for the significance of a mean-vector
using a sample of size n from the p-variate normal population is
compared with the tests constructed, as in Mudholkar and Subbaiah (1977),
by variously combining the p quasi-independent (i.e. independent only
under the null hypothesis) $t^2$-tests in J. Roy's (1958) stepdown method.

TABLE 1

ESTIMATED POWERS* OF THE COMBINATIONS OF ONE-SIDED
INDEPENDENT T-TESTS, n = 5, α = .05.

                           Noncentrality parameter μ
Configuration   Test       0.0    .2     .4     .6     .8     1.0

                Fisher     .056   .125   .280
                Logit      .054   .133   .304
                Liptak     .054   .134   .314   .520   .731   .894
                Pearson    .050   .136   .307   .518   .727   .891

                Fisher     .055   .081   .145   .191   .291   .395
                Logit      .054   .078   .144   .187   .286   .355
                Liptak     .052   .084   .141   .185   .258   .291
                Pearson    .052   .081   .130   .163   .202   .232

                Fisher     .054   .180   .445   .716   .920   .985
                Logit      .048   .195   .499   .781   .944   .991
                Liptak     .048   .198   .511   .796   .950   .992
                Pearson    .051   .062   .242   .522   .781   .929

                Fisher     .054   .075   .107   .147   .197   .245
                Logit      .048   .077   .108   .140   .180   .213
                Liptak     .048   .074   .108   .137   .169   .182
                Pearson    .051   .072   .094   .100   .120   .134

*Each estimate is based on 3000 samples.

Because of the invariance structures in the problems only normal
variables, with unit variance but different means, need to be simulated,
for which Marsaglia's (1972) Super-Duper package is used. The P-values
and percentiles are obtained using the well known IMSL routines. The
power function, which involves k noncentrality parameters in the first
experiment and p in the second, is obtained on a fine grid in the
parametric space when k = 2 = p. For higher values of k and p,
several configurations are included in the simulation but two are always
used. In the first, the noncentrality is distributed equally among the
alternatives in all the tests, and in the other it is concentrated
entirely in only one of the component tests. The above Table 1 and the
following Table 2 give capsule summaries of the currently available
results on the power functions being estimated in the two experiments.
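The general shape of such a power simulation can be sketched in a few lines. The code below is not the experiment reported here: as loudly simplifying assumptions it combines one-sided z-tests (known unit variance) rather than t-tests, estimates critical values by Monte Carlo rather than from IMSL percentiles, uses Python's `random` module in place of Super-Duper, and negates the logit sum so that large values are significant. All names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

ND = NormalDist()
EPS = 1e-12  # keeps P-values off 0 and 1 so the logarithms stay finite

def p_values(means, rng):
    """One-sided P-values from k independent N(mean_i, 1) observations."""
    ps = (1.0 - ND.cdf(rng.gauss(m, 1.0)) for m in means)
    return [min(max(p, EPS), 1.0 - EPS) for p in ps]

def fisher_stat(ps):
    return -2.0 * sum(math.log(p) for p in ps)

def logit_stat(ps):
    # negated sum of logits, so that large values are significant
    return -sum(math.log(p / (1.0 - p)) for p in ps)

def critical_value(stat, k, alpha, reps, rng):
    """Monte Carlo estimate of the upper-alpha null quantile of `stat`."""
    null = sorted(stat(p_values([0.0] * k, rng)) for _ in range(reps))
    return null[int((1.0 - alpha) * reps)]

def power(stat, crit, means, reps, rng):
    """Fraction of simulated samples in which `stat` exceeds `crit`."""
    hits = sum(stat(p_values(means, rng)) > crit for _ in range(reps))
    return hits / reps

if __name__ == "__main__":
    rng = random.Random(0)
    k, alpha = 3, 0.05
    # the two standing configurations: noncentrality spread equally,
    # or concentrated in a single component test
    configs = {"equal": [1.0] * k,
               "concentrated": [math.sqrt(k)] + [0.0] * (k - 1)}
    for name, stat in (("Fisher", fisher_stat), ("Logit", logit_stat)):
        crit = critical_value(stat, k, alpha, 20000, rng)
        for label, means in configs.items():
            print(name, label, round(power(stat, crit, means, 4000, rng), 3))
```

Estimating the critical values by simulation keeps the sketch free of any distributional approximation, at the cost of some extra Monte Carlo noise in the estimated sizes and powers.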


TABLE 2

ESTIMATED POWERS* OF T²-TEST AND THE COMBINATION TESTS,
n = 20, α = 0.05.

                           Noncentrality parameter μ
Configuration   Test       0.0    0.2    0.4    0.6    0.8    1.0

                T²         .055   .163   .539   .875   .990   1.00
                Fisher     .056   .163   .548   .885   .992   1.00
                Logit      .054   .163   .546   .888   .993   1.00
                Tippett    .056   .150   .473   .810   .971   1.00
                Pearson    .049   .148   .465   .797   .940   1.00

                T²         .049   .104   .285   .581   .836   .963
                Fisher     .048   .105   .285   .578   .830   .961
                Logit      .047   .103   .263   .528   .784   .911
                Tippett    .049   .106   .298   .602   .866   .968
                Pearson    .047   .087   .185   .258   .286   .312

                T²         .049   .192   .629   .948   .998   1.00
                Fisher     .050   .195   .645   .953   .998   1.00
                Logit      .048   .192   .640   .943   .997   1.00
                Tippett    .046   .174   .496   .844   .982   .999
                Pearson    .052   .153   .505   .796   .919   .948

                T²         .050   .091   .234   .510   .772   .926
                Fisher     .052   .095   .239   .510   .763   .917
                Logit      .052   .093   .224   .424   .642   .819
                Tippett    .048   .090   .254   .559   .828   .958
                Pearson    .047   .078   .130   .178   .196   .194

*Each estimate is based on 3000 samples.
The consistently poor performance of Pearson's statistic in the two
problems is the most striking feature of the two tables. In combining the
t-tests, the power of the logit test, at the displayed alternatives, is
between those of Fisher's and Liptak's tests. Fisher's test is marginally
superior along the coordinate axes, and Liptak's is so along the equiangular
line. However, results not in the table indicate that Liptak's
test is considerably inferior at distant points along the coordinate
axes without possessing comparable superiority when
$\mu_1 = \mu_2 = \cdots = \mu_k$.
The logit test is a good overall performer for this problem. For
Hotelling's problem, when $\mu_1 \ne 0$ and $\mu_2 = \cdots = \mu_p = 0$, Fisher's
and $T^2$ tests are indistinguishable, and both are superior to the logit
and slightly inferior to Tippett's test. However, when the noncentrality
is equidistributed, i.e. $\mu_1 = \mu_2 = \cdots = \mu_p \ne 0$, $T^2$, the logit and
Fisher's tests are indistinguishable and all three are superior to
Tippett's test.

4. THE GOODNESS-OF-FIT PROBLEM. Let $X_1,X_2,\ldots,X_n$ be a random
sample from a population with the d.f. $F(\cdot)$ and consider the problem of
testing a simple goodness of fit hypothesis $H_0: F = F_0$, where $F_0$ is a
given continuous d.f. Under $H_0$ the probability integral transforms
$Y_i = F_0(X_i)$, $i = 1,2,\ldots,n$, are similar in distribution to the P-values
in the problem of combining one-sided tests, i.e. they are uniform (0,1)
variables. Hence the combination statistics can also be used for testing
$H_0$. However the analogy between the two problems does not go very far
beyond that, and the results of the studies of the statistics in combining
tests are only marginally relevant in the present context. The major
factors differentiating the two problems include the following three:
1. Because of the large variety of possible alternative hypotheses,
neither small nor large values of $Y_i$, nor values of any other specific
kind, need be significant against $H_0$. The combination statistics are
therefore used for two-sided as well as one-sided tests of fit. 2. In
asymptotics, which comprise a substantial portion of the studies, the
number k of the test statistics is fixed and $n_i \to \infty$ in combining tests;
whereas in the goodness of fit problem, n, the sample size is allowed to
diverge. 3. For the problem of goodness of fit, starting with the work
of Kolmogorov (1939), a large class of well known studies and practically
applicable alternatives, to the combination statistics, has evolved.

The literature on the study of the combination statistics in the
context of the tests of fit is not vast but it is hard to isolate it
fully from the truly large body of work on the problem of goodness-of-fit
in its generality. Some references to and evaluations of combination
statistics as test statistics for testing the hypotheses of fit may be
found e.g. in Chapman (1958), Csorgo, Sheshadri and Yalovsky (1975),
Hegazy and Green (1975) and Lin (1977). It may be noted that in the
problem of combining one-sided tests Fisher's method possesses several
optimality properties and is widely used, but Pearson's method is not in
general considered to be a contender. In the context of goodness-of-fit,
however, neither is superior nor can be eliminated from consideration.
The logit statistic which, in a way, is a compromise between the two may
be expected to have a good overall performance. At present we are engaged
in studies of various aspects and properties of this statistic in testing
goodness-of-fit. In this section some of these are briefly outlined.

The null hypothesis $H_0: F = F_0$, in parametric models such as $F = F_0^{\theta+1}$, or $F = 1 - (1 - F_0)^{\theta+1}$, reduces to $H_0: \theta = 0$. By considering various composite alternatives in terms of sets of $\theta$ it is easy to obtain various one-sided and two-sided tests, based upon Fisher's $\Psi_F(Y_1, Y_2, \ldots, Y_n)$ and Pearson's $\Psi_P(Y_1, Y_2, \ldots, Y_n)$, as the U.M.P. or U.M.P. unbiased tests. In the model $F = [F_0^{\theta+1} + 1 - (1 - F_0)^{-\theta+1}]/2$, $0 \le \theta < 1$, the null hypothesis is again $H_0: \theta = 0$, and, for example, the one-sided test based upon the logit statistic $\Psi_L(Y_1, Y_2, \ldots, Y_n)$ is the locally most powerful test. Other models of this variety, their meaning, and the properties of the logit statistic are under investigation.
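
The local optimality claim can be checked by computing the score at $\theta = 0$. The sketch below assumes the mixture model is read as $F = [F_0^{\theta+1} + 1 - (1 - F_0)^{-\theta+1}]/2$ (the exponents are hard to read in the source, so this form is an assumption), and writes $g(y;\theta)$, a symbol introduced only for this illustration, for the density of $Y = F_0(X)$.

```latex
\begin{align*}
g(y;\theta) &= \tfrac12\left[(1+\theta)\,y^{\theta} + (1-\theta)\,(1-y)^{-\theta}\right],
  \qquad 0 < y < 1,\ 0 \le \theta < 1, \\
g(y;0) &= 1, \\
\left.\frac{\partial}{\partial\theta}\log g(y;\theta)\right|_{\theta=0}
  &= \tfrac12\left[(1 + \log y) - (1 + \log(1-y))\right]
   = \tfrac12 \log\frac{y}{1-y}.
\end{align*}
```

Under this reading the score for a sample is proportional to $\sum_i \log[Y_i/(1 - Y_i)]$, a sum of logits, which is consistent with the statement that the one-sided logit test is locally most powerful.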


In terms of the distribution of $Y = F_0(X)$ one such model is

$$\Pr(Y \le y) = G(y) = \begin{cases} a\,(y/a)^{\theta}, & 0 < y \le a < 1, \\ 1 - (1 - a)\,[(1 - y)/(1 - a)]^{\theta}, & 0 < a < y < 1. \end{cases}$$
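
A small simulation may help in reading this two-piece model. The sketch below draws Y from G by inversion and checks numerically that the folded variable $Z = \min[Y/a, (1 - Y)/(1 - a)]$ has distribution function $z^{\theta}$ on (0, 1), that is, a Lehmann-type alternative. The function names, the sample size and the seed are illustrative choices, not part of the original work.

```python
import random

def sample_y(theta, a, rng):
    """Draw Y from G by inverting the two-piece distribution function."""
    u = rng.random()
    if u <= a:
        # Lower piece: u = a * (y/a)**theta
        return a * (u / a) ** (1.0 / theta)
    # Upper piece: u = 1 - (1-a) * ((1-y)/(1-a))**theta
    return 1.0 - (1.0 - a) * ((1.0 - u) / (1.0 - a)) ** (1.0 / theta)

def fold(y, a):
    """Z = min[Y/a, (1-Y)/(1-a)]."""
    return min(y / a, (1.0 - y) / (1.0 - a))

def max_cdf_gap(theta, a, n=20000, seed=1):
    """Largest gap between the empirical cdf of Z and z**theta."""
    rng = random.Random(seed)
    zs = sorted(fold(sample_y(theta, a, rng), a) for _ in range(n))
    return max(abs((i + 1) / n - z ** theta) for i, z in enumerate(zs))
```

The gap is small because $\Pr(Z \le z) = G(az) + 1 - G(1 - (1 - a)z) = a z^{\theta} + (1 - a) z^{\theta} = z^{\theta}$.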


In this model the null hypothesis reduces to $H_0: \theta = 1$, and various two-sided alternatives are considered. These problems have been reduced to simpler problems involving Lehmann-type alternatives by using the transformation $Z = \min[Y/a, (1 - Y)/(1 - a)]$. Chapman (1958) introduces a simple and elegant technique for evaluating maximum and minimum powers of one-sided tests of fit against alternatives at a fixed "distance" from the simple hypothesis. Using this technique he shows that, in terms of the maximum power, Pearson's test is superior to Fisher's test, and also to such conventional tests as those due to Anderson and Darling, Cramer and

The asymptotic methods used in the analyses of the goodness-of-fit
tests  are  more  traditional  and  better  known,  but  they  do  not  reduce  the 
class  of  competing  tests  to  the  extent  accomplished,  using  the  exact 
slopes,  in  the  problem  of  combining  tests.  Asymptotic  analyses  of 
the logit statistic for testing the simple, and some composite hypotheses
of goodness-of-fit, using Pitman's, Bahadur's, and Chernoff's (1952)
measures of A.R.E. are in progress. We conclude this section by
describing  a use  of  the  combination  statistics  for  testing  the  composite 
hypothesis  of  exponentiality  and  an  empirical  comparison  of  Fisher's, 
Pearson's  and  the  logit  statistics  in  this  context. 


Combination Statistics for Testing Exponentiality. Let $X_1, X_2, \ldots, X_n$ be nonnegative i.i.d. random variables and consider the problem of testing the composite hypothesis that their common distribution is exponential. Let $D_i = (n - i + 1)(X_{(i)} - X_{(i-1)})$, $i = 1, 2, \ldots, n$, denote the normalized spacings of the ordered X's, $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$, $X_{(0)} = 0$. Also let $S_r = \sum_{i=1}^{r} D_i$, $r = 1, 2, \ldots, n$, and $Z_r = S_r/S_n$, $r = 1, 2, \ldots, n-1$. It is well known that if, and only if, the X's are exponentially distributed then (i) the $D_i$'s, $i = 1, 2, \ldots, n$, are i.i.d. exponentials, and (ii) the $Z_r$'s, $r = 1, 2, \ldots, n-1$, are distributed as the $(n-1)$ uniform order statistics.
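
The construction of the normalized spacings and of the scaled partial sums can be sketched in a few lines. The function names are illustrative, and the notation $S_r$, $Z_r$ follows the definitions above ($S_r$ being the partial sum of the first r spacings).

```python
def normalized_spacings(xs):
    """D_i = (n - i + 1) * (X_(i) - X_(i-1)), with X_(0) = 0."""
    ordered = sorted(xs)
    n = len(ordered)
    prev = 0.0
    ds = []
    for i, x in enumerate(ordered, start=1):
        ds.append((n - i + 1) * (x - prev))
        prev = x
    return ds

def scaled_partial_sums(xs):
    """Z_r = S_r / S_n for r = 1, ..., n-1, where S_r = D_1 + ... + D_r."""
    ds = normalized_spacings(xs)
    total = sum(ds)  # S_n, which also equals the sum of the observations
    zs, running = [], 0.0
    for d in ds[:-1]:
        running += d
        zs.append(running / total)
    return zs
```

For the sample (4, 1, 2) the spacings are (3, 2, 2) and the two Z values are 3/7 and 5/7. Because each $Z_r$ is a ratio of sums, the Z values do not depend on the scale of the X's, which is what makes them suitable for the composite hypothesis.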

In view of these characterizations, Csörgő, Sheshadri and Yalovsky (1975) suggest the Pearson statistic $\Psi_P = -2 \sum_{r=1}^{n-1} \log Z_r$ for testing exponentiality of the X's. Obviously, there are several competitors including other combination statistics, $\Psi_F = -2 \sum_{r=1}^{n-1} \log(1 - Z_r)$ and $\Psi_L = (\Psi_F - \Psi_P)/2$, and empirical distribution function statistics such as Anderson-Darling $A^2$, all to be used in two-sided tests. The following Table 3 contains a small illustrative selection of the results of a Monte Carlo experiment performed with a view to comparing these as the tests of exponentiality, over several alternatives. The parameters of the simulation, which uses Marsaglia's (1972) Super-Duper package as a basis for generating various random variables, are $n = 20$, $\alpha = .1$, and each estimate is based upon 1000 samples. More detailed results are given by Lin (1977).

TABLE 3

EMPIRICAL POWERS* OF THE FOUR TESTS OF EXPONENTIALITY; n = 20, α = .1

                                          TEST
Alternative                  Fisher   Pearson   Logit   Anderson-Darling

$\chi^2_1$                    .51      .81      .82        .73
$\chi^2_2$ = Exponential      .100     .100     .102       .103
(label illegible)             .20      .31      .30        .25
$\chi^2_8$                    .99     1.00      .93        .91
Lognormal (0,1)               .20      .14      .14        .22
Weibull (.5)                  .84      .98      .97        .97
Weibull (2.0)                 .97      .97      .98        .97
Beta (2,1)                    .46      .28      .47        .40

* Each estimate is based upon 1000 samples.
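
A Monte Carlo comparison of this kind can be sketched as follows. This is not the authors' program: it uses Python's built-in generator rather than Super-Duper, it obtains equal-tailed two-sided cutoffs from a simulated null sample (an assumed choice, since the exact two-sided rule is not stated here), it takes Fisher's statistic as $-2 \sum \log(1 - Z_r)$ and the logit statistic as half of Fisher's minus Pearson's, and it omits the Anderson-Darling statistic. All function names are illustrative.

```python
import math
import random

def z_values(xs):
    """Z_r = S_r/S_n from the normalized spacings of the ordered sample."""
    ordered = sorted(xs)
    n = len(ordered)
    prev, running, partial = 0.0, 0.0, []
    for i, x in enumerate(ordered, start=1):
        running += (n - i + 1) * (x - prev)
        prev = x
        partial.append(running)
    return [s / running for s in partial[:-1]]

def combination_statistics(xs):
    """Return (Fisher, Pearson, logit) statistics computed from the Z_r."""
    zs = z_values(xs)
    pearson = -2.0 * sum(math.log(z) for z in zs)
    fisher = -2.0 * sum(math.log(1.0 - z) for z in zs)
    return fisher, pearson, (fisher - pearson) / 2.0

def two_sided_cutoffs(values, alpha):
    """Equal-tailed empirical cutoffs (an assumed two-sided rule)."""
    s = sorted(values)
    lo = s[int(len(s) * alpha / 2)]
    hi = s[int(len(s) * (1 - alpha / 2)) - 1]
    return lo, hi

def estimate_powers(draw, n=20, alpha=0.1, reps=1000, null_reps=5000, seed=0):
    """Estimated rejection rates of the three tests when X ~ draw(rng)."""
    rng = random.Random(seed)
    # The Z_r are scale invariant, so a unit-rate exponential covers the null.
    null = [combination_statistics([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(null_reps)]
    cuts = [two_sided_cutoffs([row[j] for row in null], alpha) for j in range(3)]
    hits = [0, 0, 0]
    for _ in range(reps):
        stats = combination_statistics([draw(rng) for _ in range(n)])
        for j in range(3):
            if not cuts[j][0] <= stats[j] <= cuts[j][1]:
                hits[j] += 1
    return [h / reps for h in hits]
```

For example, `estimate_powers(lambda rng: rng.weibullvariate(1.0, 2.0))` gives rough power estimates against a Weibull alternative with shape 2, and passing `lambda rng: rng.expovariate(1.0)` gives the empirical level. The figures will not reproduce Table 3 exactly, since the generator, the cutoffs and the random samples all differ.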

5.  REMARKS.  The  following  miscellaneous  comments  stated  in  terms 
of  the  problem  of  combining  tests  are  also  relevant  to  the  other 
applications  of  the  combination  statistics. 

1. Weighted Logit Statistic. Analogous to the weighted version of Fisher's statistic, the weighted logit statistic with weights $w_i$, $i = 1, 2, \ldots, k$, is given by

$$\Psi_L = -\sum_{i=1}^{k} w_i \log[P_i/(1 - P_i)]. \qquad (5.1)$$

Under the null hypothesis, $\Psi_L$ may be approximated in law by a scaled t-variable $C\,t_\nu$. The constant C and the d.f. parameter $\nu$ are determined by (i) equating the variances: $C^2 \nu/(\nu - 2) = \pi^2 \sum w_i^2/3$, and (ii) equating the excesses of kurtosis: $6/(\nu - 4) = (1.2) \sum w_i^4/(\sum w_i^2)^2$. The quality of this approximation is expected to be similar to that in Section 2.
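
The two moment-matching conditions can be solved in closed form for $\nu$ and C. The sketch below does so, using the facts that a standard logistic variable has variance $\pi^2/3$ and excess kurtosis 1.2, and that $t_\nu$ has variance $\nu/(\nu - 2)$ and excess kurtosis $6/(\nu - 4)$. The function name is illustrative.

```python
import math

def scaled_t_parameters(weights):
    """Return (C, nu) such that C * t_nu matches the null variance and
    excess kurtosis of the weighted logit statistic."""
    s2 = sum(w ** 2 for w in weights)
    s4 = sum(w ** 4 for w in weights)
    # (ii) 6/(nu - 4) = 1.2 * s4 / s2**2  =>  nu = 4 + 5 * s2**2 / s4
    nu = 4.0 + 5.0 * s2 ** 2 / s4
    # (i) C**2 * nu/(nu - 2) = pi**2 * s2 / 3
    c = math.sqrt(math.pi ** 2 * s2 * (nu - 2.0) / (3.0 * nu))
    return c, nu
```

With k equal unit weights this gives $\nu = 5k + 4$ and $C^2 = k\pi^2(5k + 2)/[3(5k + 4)]$; for k = 5, for instance, $\nu = 29$.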

2. Selection of the Weights. There is no obvious or simple method known for selecting the weights $w_i$. If there is some clue to the alternative hypotheses then it may indicate some approach to the solution, as indicated in Section 3. Otherwise, one may attempt to use the asymptotic argument suggested by Lancaster (1961), or J. Hemelrijk's adaptive approach discussed by Oosterhoff (1969).

3. Discrete P-values. If the P-values are discrete then one method is to use the randomized probability integral transforms, which lead to randomized tests. An alternative, which leads to nonrandomized tests, is to replace the logits $\log[P_i/(1 - P_i)]$, using arguments similar to Lancaster (1949), by their conditional expectation, or simpler approximations for them. This method makes such adjustments in the expectation and the variance of the statistic as to render the error due to ignoring discreteness negligible.
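
The first method can be illustrated with a small sketch of a randomized probability integral transform. The binomial test statistic and the upper-tail convention are hypothetical choices made only for this example; the text does not specify them.

```python
import math

def binomial_pmf(t, n, p):
    """Pr(T = t) for T ~ Binomial(n, p)."""
    return math.comb(n, t) * p ** t * (1.0 - p) ** (n - t)

def randomized_p_value(t, n, p, u):
    """Upper-tail randomized P-value for T ~ Binomial(n, p):
    Pr(T > t) + u * Pr(T = t), where u is drawn from Uniform(0, 1)."""
    upper = sum(binomial_pmf(j, n, p) for j in range(t + 1, n + 1))
    return upper + u * binomial_pmf(t, n, p)
```

When u is uniform on (0, 1) and independent of T, the resulting P-value is exactly uniform under the null hypothesis, so its logit can be used in the combination statistic as in the continuous case, at the price of a randomized test.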

4. Two-sided P-values. The Z-transformation mentioned in Section 4, in the context of Chapman's method, is introduced in George (1977) as a one-parameter family, $\min[F_0(T)/\lambda, (1 - F_0(T))/(1 - \lambda)]$, $0 \le \lambda \le 1$, of two-sided P-values. A manuscript discussing the choice of the parameter $\lambda$ and various applications is in preparation.
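
As a small illustration, the family can be written as a function of the null probability integral transform $F_0(T)$ and the parameter $\lambda$. The function name is illustrative, and treating the endpoints $\lambda = 0$ and $\lambda = 1$ as the one-sided limits is an assumption of this sketch.

```python
def two_sided_p_value(f0_t, lam):
    """min[F0(T)/lam, (1 - F0(T))/(1 - lam)] for 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if lam == 0.0:
        return 1.0 - f0_t  # only the upper-tail term is finite
    if lam == 1.0:
        return f0_t        # only the lower-tail term is finite
    return min(f0_t / lam, (1.0 - f0_t) / (1.0 - lam))
```

With $\lambda = 1/2$ this is the familiar doubled smaller tail probability. For any $\lambda$ strictly between 0 and 1 the value is uniform on (0, 1) when $F_0(T)$ is, since the two tails contribute $\lambda z$ and $(1 - \lambda) z$ to $\Pr(\cdot \le z)$.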

5. Exact Slopes and Power. It is now generally recognized that the relationship between the two methods of comparing tests, viz. in terms of the power and the exact slopes, is tenuous. A reasonable empirical approach to investigating the finite sample behavior of tests with equal slopes is to focus on estimating some location parameter of the P-values, possibly transformed by the logs or logits.

6. Two-Sample Problem. Several aspects of the logit statistic and
the Z transformation in the context of the two-sample problem and its
one-sample limits, as outlined in Chapter 5 of George (1977), are under study.

ACKNOWLEDGEMENTS. The authors are thankful to Professors C.C. Lin, P. Subbaiah, and Mr. F.C. Pun for discussions and computational assistance and to Professors M.L. Davidson and W.J. Hall for comments and suggestions concerning the manuscript.